This Jupyter Notebook describes all the code required to the Asset pricing workshop 2 exercise 1. Presented by: Miguel Ortiz, Julián Alemán y Oscar Arenas.
# Importing packages required
import pandas as pd
import numpy as np
import pyreadstat
import pyarrow.feather as feather
import datetime
from datetime import date
import math
import scipy.stats
import statsmodels.regression.linear_model as lm
import statsmodels.tools.tools as ct
pd.options.display.float_format = '{:,.4f}'.format # Reduce decimal points to 4
pd.pandas.set_option('display.max_columns', None)
# Visualisation
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
# Reading the database with the ETF and KF Factors values
df = pd.read_excel('DATOS Taller 2 AP 2022.xlsx', sheet_name='ETF_DB',index_col=0)
# df = pd.read_excel('C:/Users/Miguel Angel/OneDrive - Universidad EAFIT/2022-1/Asset pricing/Workshop/Workshop 2/DATOS Taller 2 AP 2022.xlsx', sheet_name='ETF_DB',index_col=0)
# Data exploration - visualization
df_plot = df.reset_index()
df_plot2=pd.melt(df_plot, id_vars =['DATE'], value_vars =['VTV', 'VGT', 'VB', 'USMV', 'PNQI', 'MTUM', 'XBI', 'SPHQ', 'ESGU','SECT', 'VUG', 'LRGF', 'ERP', 'SMB', 'HML', 'RF', 'MOM'])
fig = px.line(df_plot2, x="DATE", y="value", color='variable', title='ETFs values')
fig.show()
df_plot3=pd.melt(df_plot, id_vars =['DATE'], value_vars =['ERP', 'SMB', 'HML', 'RF', 'MOM'])
fig2 = px.line(df_plot3, x="DATE", y="value", color='variable', title='Factors values', color_discrete_sequence=px.colors.qualitative.Dark2)
fig2.show()
Analizing the data it is shown thath the ETF with the highest value at the end of the analized time is the VGT which is the Vanguard Information Technology Index Fund ETF Shares which follow the MSCI US Investable Market Index (IMI)/Information Technology 25/50 returns by a complete reply. Also, ti can be seen that the less volatile factor is the SMB.
# To drop objets used
del [df_plot, fig, fig2, df_plot2, df_plot3]
# Database structure
df.info()
<class 'pandas.core.frame.DataFrame'> DatetimeIndex: 121 entries, 2012-02-29 to 2022-02-28 Data columns (total 17 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 VTV 121 non-null float64 1 VGT 121 non-null float64 2 VB 121 non-null float64 3 USMV 121 non-null float64 4 PNQI 121 non-null float64 5 MTUM 107 non-null float64 6 XBI 121 non-null float64 7 SPHQ 121 non-null float64 8 ESGU 63 non-null float64 9 SECT 54 non-null float64 10 VUG 121 non-null float64 11 LRGF 83 non-null float64 12 ERP 121 non-null float64 13 SMB 121 non-null float64 14 HML 121 non-null float64 15 RF 121 non-null float64 16 MOM 121 non-null float64 dtypes: float64(17) memory usage: 17.0 KB
It is noticed that for some variables there few missing values such as: MTUM (14), ESGU (58), SECT (-67), and LRGF (38).
# Database main stats
df.describe()
| VTV | VGT | VB | USMV | PNQI | MTUM | XBI | SPHQ | ESGU | SECT | VUG | LRGF | ERP | SMB | HML | RF | MOM | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 121.0000 | 121.0000 | 121.0000 | 121.0000 | 121.0000 | 107.0000 | 121.0000 | 121.0000 | 63.0000 | 54.0000 | 121.0000 | 83.0000 | 121.0000 | 121.0000 | 121.0000 | 121.0000 | 121.0000 |
| mean | 94.2680 | 173.7803 | 135.7637 | 49.2444 | 112.0500 | 101.6284 | 74.4656 | 28.4374 | 70.4888 | 31.7255 | 143.0460 | 31.5708 | 0.0119 | -0.0008 | -0.0007 | 0.0004 | 0.0020 |
| std | 22.9167 | 105.7000 | 39.2722 | 14.0322 | 60.6486 | 38.0144 | 30.7528 | 9.6530 | 16.9232 | 5.8871 | 66.0860 | 6.1456 | 0.0398 | 0.0265 | 0.0326 | 0.0007 | 0.0356 |
| min | 53.7700 | 66.7500 | 72.6800 | 27.6380 | 37.7999 | 51.3700 | 25.6200 | 14.7400 | 49.1260 | 24.2900 | 66.3100 | 22.6100 | -0.1338 | -0.0832 | -0.1402 | 0.0000 | -0.1245 |
| 25% | 79.2800 | 100.0000 | 111.2400 | 37.6800 | 68.2407 | 70.3999 | 51.3300 | 21.5800 | 58.3672 | 27.6383 | 99.6000 | 26.5850 | -0.0019 | -0.0189 | -0.0183 | 0.0000 | -0.0202 |
| 50% | 93.4700 | 133.0000 | 130.9300 | 47.7200 | 92.3100 | 95.6800 | 74.0300 | 27.3300 | 64.5300 | 29.0763 | 120.4700 | 31.2500 | 0.0143 | 0.0016 | -0.0038 | 0.0001 | 0.0036 |
| 75% | 107.7600 | 213.5600 | 154.6100 | 60.6300 | 137.7653 | 120.8150 | 90.5400 | 33.2200 | 81.4200 | 35.3800 | 166.1800 | 33.7900 | 0.0319 | 0.0140 | 0.0130 | 0.0009 | 0.0216 |
| max | 147.1100 | 458.1700 | 229.4900 | 80.9000 | 256.6500 | 190.6800 | 152.8800 | 53.2100 | 107.9000 | 44.8700 | 320.9000 | 46.5600 | 0.1365 | 0.0706 | 0.1279 | 0.0021 | 0.1006 |
The ETF with higher standard deviation is the VGT. And the factor with the lowest standard deviation is the SMB.
# To calculate the monthly growth
returns = df.pct_change()
# Database returns main stats
returns.describe()
| VTV | VGT | VB | USMV | PNQI | MTUM | XBI | SPHQ | ESGU | SECT | VUG | LRGF | ERP | SMB | HML | RF | MOM | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 120.0000 | 120.0000 | 120.0000 | 120.0000 | 120.0000 | 106.0000 | 120.0000 | 120.0000 | 62.0000 | 53.0000 | 120.0000 | 82.0000 | 120.0000 | 120.0000 | 120.0000 | 72.0000 | 120.0000 |
| mean | 0.0086 | 0.0157 | 0.0096 | 0.0087 | 0.0139 | 0.0116 | 0.0142 | 0.0107 | 0.0123 | 0.0102 | 0.0126 | 0.0078 | -2.8162 | -5.3380 | -0.4014 | inf | -0.1945 |
| std | 0.0387 | 0.0477 | 0.0487 | 0.0323 | 0.0564 | 0.0413 | 0.0854 | 0.0357 | 0.0459 | 0.0477 | 0.0427 | 0.0438 | 20.0015 | 38.6725 | 19.7084 | NaN | 17.7556 |
| min | -0.1558 | -0.1008 | -0.2224 | -0.1185 | -0.1169 | -0.1193 | -0.2799 | -0.0990 | -0.1296 | -0.1272 | -0.1083 | -0.1555 | -129.0000 | -417.0000 | -46.3333 | -1.0000 | -69.6667 |
| 25% | -0.0112 | -0.0126 | -0.0107 | -0.0099 | -0.0197 | -0.0112 | -0.0396 | -0.0089 | -0.0007 | -0.0111 | -0.0109 | -0.0107 | -1.8064 | -2.0933 | -1.9207 | -0.0132 | -2.0419 |
| 50% | 0.0125 | 0.0204 | 0.0126 | 0.0115 | 0.0176 | 0.0144 | 0.0142 | 0.0129 | 0.0171 | 0.0180 | 0.0163 | 0.0113 | -0.8480 | -1.0913 | -0.9273 | 0.0000 | -1.0816 |
| 75% | 0.0298 | 0.0448 | 0.0368 | 0.0312 | 0.0467 | 0.0368 | 0.0764 | 0.0336 | 0.0339 | 0.0375 | 0.0360 | 0.0313 | 0.1081 | -0.0290 | -0.1185 | 0.1295 | 0.0840 |
| max | 0.1276 | 0.1420 | 0.1578 | 0.0937 | 0.1667 | 0.1170 | 0.2060 | 0.1226 | 0.1329 | 0.1195 | 0.1508 | 0.1153 | 72.9091 | 14.7241 | 201.0000 | inf | 152.5000 |
The ETF with the higher simple return is VGT with 0.015731 (1.573%). And the ETF with the higher standard deviation is XBI with 0.0853. But also for the factors it looks to have a negative expected simple return with SMB with the highest expected lose.
# To create an index based on montlhy growth
df_index = returns.pct_change().fillna(0) \
.add(1).cumprod().mul(100).reset_index()
df_indexfactor = df.add(1).cumprod().mul(100).reset_index()
# Data exploratory - visualization of indexes base 100
df_index_plot = df_index.reset_index()
df_index_plot=pd.melt(df_index_plot, id_vars =['DATE'], value_vars =['VTV', 'VGT', 'VB', 'USMV', 'PNQI', 'MTUM', 'XBI', 'SPHQ', 'ESGU','SECT', 'VUG', 'LRGF'])
fig3 = px.line(df_index_plot, x="DATE", y="value", color='variable', title='ETFs growth index (base 100)', color_discrete_sequence=px.colors.qualitative.Bold)
fig3.show()
df_indexfactor_plot = df_indexfactor.reset_index()
df_indexfactor_plot=pd.melt(df_indexfactor_plot, id_vars =['DATE'], value_vars =['ERP', 'SMB', 'HML', 'RF', 'MOM'])
fig4 = px.line(df_indexfactor_plot, x="DATE", y="value", color='variable', title='Factors growth index (base 100)', color_discrete_sequence=px.colors.qualitative.Dark2)
fig4.show()
It is shown that VGT ETF present the highest volatility, but the momentum factor shows to give more positive returns than others, except for the market (ERP).
# To drop objets used
del [df_index_plot, df_index, fig3, df_indexfactor, df_indexfactor_plot, fig4, returns]
# Calculating the log returns for every variable in the dataset
logreturns = np.log(df) - np.log(df.shift(1))
# lnreturns = np.log1p(df.pct_change())
# To replace the columns avoiding the returns calculation for factors
columns=['ERP', 'SMB', 'HML', 'RF', 'MOM']
logreturns[columns] = df[columns]
del [columns] # To drop the aux object
logreturns = logreturns.iloc[1: , :] # To drop the first row of the dataframe
# Database returns main stats
logreturns.describe()
| VTV | VGT | VB | USMV | PNQI | MTUM | XBI | SPHQ | ESGU | SECT | VUG | LRGF | ERP | SMB | HML | RF | MOM | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 120.0000 | 120.0000 | 120.0000 | 120.0000 | 120.0000 | 106.0000 | 120.0000 | 120.0000 | 62.0000 | 53.0000 | 120.0000 | 82.0000 | 120.0000 | 120.0000 | 120.0000 | 120.0000 | 120.0000 |
| mean | 0.0078 | 0.0145 | 0.0084 | 0.0082 | 0.0122 | 0.0107 | 0.0105 | 0.0100 | 0.0112 | 0.0091 | 0.0116 | 0.0068 | 0.0117 | -0.0006 | -0.0007 | 0.0005 | 0.0021 |
| std | 0.0390 | 0.0473 | 0.0495 | 0.0324 | 0.0559 | 0.0412 | 0.0864 | 0.0356 | 0.0460 | 0.0481 | 0.0425 | 0.0443 | 0.0399 | 0.0266 | 0.0327 | 0.0007 | 0.0358 |
| min | -0.1694 | -0.1062 | -0.2515 | -0.1261 | -0.1244 | -0.1270 | -0.3284 | -0.1043 | -0.1388 | -0.1360 | -0.1146 | -0.1690 | -0.1338 | -0.0832 | -0.1402 | 0.0000 | -0.1245 |
| 25% | -0.0112 | -0.0127 | -0.0107 | -0.0100 | -0.0199 | -0.0113 | -0.0404 | -0.0089 | -0.0007 | -0.0112 | -0.0110 | -0.0108 | -0.0036 | -0.0191 | -0.0184 | 0.0000 | -0.0203 |
| 50% | 0.0124 | 0.0202 | 0.0125 | 0.0114 | 0.0174 | 0.0143 | 0.0141 | 0.0128 | 0.0169 | 0.0178 | 0.0161 | 0.0112 | 0.0140 | 0.0021 | -0.0039 | 0.0001 | 0.0038 |
| 75% | 0.0294 | 0.0438 | 0.0361 | 0.0307 | 0.0456 | 0.0361 | 0.0736 | 0.0331 | 0.0333 | 0.0368 | 0.0353 | 0.0309 | 0.0315 | 0.0144 | 0.0135 | 0.0009 | 0.0218 |
| max | 0.1201 | 0.1328 | 0.1465 | 0.0896 | 0.1542 | 0.1107 | 0.1873 | 0.1157 | 0.1248 | 0.1129 | 0.1405 | 0.1091 | 0.1365 | 0.0706 | 0.1279 | 0.0021 | 0.1006 |
The ETF with the higher log return is also VGT with 0.014505 (1.4505%). And the ETF with the higher standard deviation is XBI also with 0.086356. Now for the factors information it looks that ERP and Momentum give the highest expected returns.
# Returns Data exploratory - visualization
df_plot = logreturns.reset_index()
df_plot2=pd.melt(df_plot, id_vars =['DATE'], value_vars =['VTV', 'VGT', 'VB', 'USMV', 'PNQI', 'MTUM', 'XBI', 'SPHQ', 'ESGU','SECT', 'VUG', 'LRGF', 'ERP', 'SMB', 'HML', 'RF', 'MOM'])
fig = px.line(df_plot2, x="DATE", y="value", color='variable', title='ETFs log return values', color_discrete_sequence=px.colors.qualitative.Safe)
fig.show()
df_plot3=pd.melt(df_plot, id_vars =['DATE'], value_vars =['ERP', 'SMB', 'HML', 'RF', 'MOM'])
fig2 = px.line(df_plot3, x="DATE", y="value", color='variable', title='Factors log values', color_discrete_sequence=px.colors.qualitative.Vivid)
fig2.show()
There is visual evidence that the performance of the VGT ETF is pretty similar with the ERP factor behavior.
# To drop objets used
del [df_plot, fig, fig2, df_plot2, df_plot3]
# Data exploratory - visualization of return index base 100
df_indexfactor = logreturns.add(1).cumprod().mul(100).reset_index()
df_index_plot = df_indexfactor.reset_index()
df_index_plot=pd.melt(df_index_plot, id_vars =['DATE'], value_vars =['VTV', 'VGT', 'VB', 'USMV', 'PNQI', 'MTUM', 'XBI', 'SPHQ', 'ESGU','SECT', 'VUG', 'LRGF', 'ERP', 'SMB', 'HML', 'RF', 'MOM'])
fig3 = px.line(df_index_plot, x="DATE", y="value", color='variable', title='ETFs and Factors log return index (base 100)')
fig3.show()
But in the index analysis VGT factor gives better performance than any other ETF, even if it compare with factors returns.
# To drop objets used
del [df_index_plot, fig3, df_indexfactor]
1.1. To determine the statically significance of factors
# To calculate the mean by column
logreturns_mean=logreturns.mean(axis = 0, skipna = True)
# To calculate the std by column
logreturns_std=logreturns.std(axis = 0, skipna = True)
# To calculate the Z value by column
z_value = (logreturns_mean*math.sqrt(len(df.axes[0]))/logreturns_std)
# It is created a dataframe to get all the information in one object
frame = {'z_value': z_value, 'mean': logreturns_mean, 'std': logreturns_std}
result = pd.DataFrame(frame)
# Find p-value
z_pvalue = pd.DataFrame(scipy.stats.norm.sf(abs(result.z_value))*2)
result.reset_index(inplace=True)
result['zpvalue'] = z_pvalue # To add the p value of the Z score for every factor (including the ETFs)
result
| index | z_value | mean | std | zpvalue | |
|---|---|---|---|---|---|
| 0 | VTV | 2.2002 | 0.0078 | 0.0390 | 0.0278 |
| 1 | VGT | 3.3751 | 0.0145 | 0.0473 | 0.0007 |
| 2 | VB | 1.8579 | 0.0084 | 0.0495 | 0.0632 |
| 3 | USMV | 2.7753 | 0.0082 | 0.0324 | 0.0055 |
| 4 | PNQI | 2.4044 | 0.0122 | 0.0559 | 0.0162 |
| 5 | MTUM | 2.8521 | 0.0107 | 0.0412 | 0.0043 |
| 6 | XBI | 1.3317 | 0.0105 | 0.0864 | 0.1830 |
| 7 | SPHQ | 3.0880 | 0.0100 | 0.0356 | 0.0020 |
| 8 | ESGU | 2.6725 | 0.0112 | 0.0460 | 0.0075 |
| 9 | SECT | 2.0733 | 0.0091 | 0.0481 | 0.0381 |
| 10 | VUG | 3.0083 | 0.0116 | 0.0425 | 0.0026 |
| 11 | LRGF | 1.6874 | 0.0068 | 0.0443 | 0.0915 |
| 12 | ERP | 3.2168 | 0.0117 | 0.0399 | 0.0013 |
| 13 | SMB | -0.2554 | -0.0006 | 0.0266 | 0.7985 |
| 14 | HML | -0.2445 | -0.0007 | 0.0327 | 0.8068 |
| 15 | RF | 7.5307 | 0.0005 | 0.0007 | 0.0000 |
| 16 | MOM | 0.6431 | 0.0021 | 0.0358 | 0.5202 |
According to the p value observed in the last table the factors SMB, HML and MOM showed not to be statically different from zero. Which means that all the factors mentioned could not give additional information about the ETF (VTV, VGT, VB, USMV, PNQI, MTUM, XBI, SPHQ, ESGU, SECT, VUG and LRGF) returns in excess, except for the ERP. Also, there is evidence that the returns provident by the ETF are statistically different from zero. Regarding the signs of the factors, the ERP obtains the expected sign which relates the excess return provided by the market with respect to the market given a level of risk. However, for the SMB and HML factors, the expected sign is not obtained dude, according to Fama and French (2014), the size premium (Small Minus Big) is expected to be positive as well as the value premium (High Minus Low). In other words, small firms should obtain better returns (SMB) and firms with a high book-to-market ratio should provide higher retunr.
1.2. To calculate the returns in excess from each ETF.
# Calculating the return in excess for each ETF
logreturns['VTVn'] =logreturns['VTV'] -logreturns['RF']
logreturns['VGTn'] =logreturns['VGT'] -logreturns['RF']
logreturns['VBn'] =logreturns['VB'] -logreturns['RF']
logreturns['USMVn'] =logreturns['USMV'] -logreturns['RF']
logreturns['PNQIn'] =logreturns['PNQI'] -logreturns['RF']
logreturns['MTUMn'] =logreturns['MTUM'] -logreturns['RF']
logreturns['XBIn'] =logreturns['XBI'] -logreturns['RF']
logreturns['SPHQn'] =logreturns['SPHQ'] -logreturns['RF']
logreturns['ESGUn'] =logreturns['ESGU'] -logreturns['RF']
logreturns['SECTn'] =logreturns['SECT'] -logreturns['RF']
logreturns['VUGn'] =logreturns['VUG'] -logreturns['RF']
logreturns['LRGFn'] =logreturns['LRGF'] -logreturns['RF']
# Data exploratory - visualization of return indexes base 100
df_indexfactor = logreturns.add(1).cumprod().mul(100).reset_index()
df_index_plot = df_indexfactor.reset_index()
df_index_plot2=pd.melt(df_index_plot, id_vars =['DATE'], value_vars =['VTVn', 'VGTn', 'VBn', 'USMVn', 'PNQIn', 'MTUMn', 'XBIn', 'SPHQn', 'ESGUn', 'SECTn', 'VUGn', 'LRGFn'])
fig3 = px.line(df_index_plot2, x="DATE", y="value", color='variable', title='ETFs return in excess index (base 100)')
fig3.show()
df_index_plot3=pd.melt(df_index_plot, id_vars =['DATE'], value_vars =['VTV', 'VGT', 'VB', 'USMV', 'PNQI', 'MTUM', 'XBI', 'SPHQ', 'ESGU','SECT', 'VUG', 'LRGF'])
fig4 = px.line(df_index_plot3, x="DATE", y="value", color='variable', title='ETFs return index (base 100)')
fig4.show()
It is noticed that even with the returns in excess the VGT factor provides the best performance if we suppose the initial investment of USD 100 by the beginning of the time. It is shown that VTVn, VGTn, USMVn, PNQIn, MTUMn, SPHQn, ESGUn and VUGn ETF returns in excess are statically diferent from zero, but for VBn, XBIn, SECTn and LRGFn it is shown that the excess' returns are not statically different form zero. When calculating the excess returns of each of the ETFs, no major difference is found in their evaluation over time. That is, assuming that USD 100 is invested in each of the ETFs at the beginning of the observation period the evaluation of the investment in each ETF does not change significantly as can be seen in Graph 1.
# To calculate the mean by column
logreturns_mean=logreturns.mean(axis = 0, skipna = True)
# To calculate the std by column
logreturns_std=logreturns.std(axis = 0, skipna = True)
# To calculate the Z value by column
z_value = (logreturns_mean*math.sqrt(len(df.axes[0]))/logreturns_std)
# It is created a dataframe to get all the information in one object
frame = {'z_value': z_value, 'mean': logreturns_mean, 'std': logreturns_std}
result = pd.DataFrame(frame)
# Find p-value
z_pvalue = pd.DataFrame(scipy.stats.norm.sf(abs(result.z_value))*2)
result.reset_index(inplace=True)
result['zpvalue'] = z_pvalue # To add the p value of the Z score for every factor (including the ETFs)
result
| index | z_value | mean | std | zpvalue | |
|---|---|---|---|---|---|
| 0 | VTV | 2.2002 | 0.0078 | 0.0390 | 0.0278 |
| 1 | VGT | 3.3751 | 0.0145 | 0.0473 | 0.0007 |
| 2 | VB | 1.8579 | 0.0084 | 0.0495 | 0.0632 |
| 3 | USMV | 2.7753 | 0.0082 | 0.0324 | 0.0055 |
| 4 | PNQI | 2.4044 | 0.0122 | 0.0559 | 0.0162 |
| 5 | MTUM | 2.8521 | 0.0107 | 0.0412 | 0.0043 |
| 6 | XBI | 1.3317 | 0.0105 | 0.0864 | 0.1830 |
| 7 | SPHQ | 3.0880 | 0.0100 | 0.0356 | 0.0020 |
| 8 | ESGU | 2.6725 | 0.0112 | 0.0460 | 0.0075 |
| 9 | SECT | 2.0733 | 0.0091 | 0.0481 | 0.0381 |
| 10 | VUG | 3.0083 | 0.0116 | 0.0425 | 0.0026 |
| 11 | LRGF | 1.6874 | 0.0068 | 0.0443 | 0.0915 |
| 12 | ERP | 3.2168 | 0.0117 | 0.0399 | 0.0013 |
| 13 | SMB | -0.2554 | -0.0006 | 0.0266 | 0.7985 |
| 14 | HML | -0.2445 | -0.0007 | 0.0327 | 0.8068 |
| 15 | RF | 7.5307 | 0.0005 | 0.0007 | 0.0000 |
| 16 | MOM | 0.6431 | 0.0021 | 0.0358 | 0.5202 |
| 17 | VTVn | 2.0668 | 0.0073 | 0.0391 | 0.0388 |
| 18 | VGTn | 3.2679 | 0.0141 | 0.0473 | 0.0011 |
| 19 | VBn | 1.7536 | 0.0079 | 0.0496 | 0.0795 |
| 20 | USMVn | 2.6173 | 0.0077 | 0.0325 | 0.0089 |
| 21 | PNQIn | 2.3131 | 0.0118 | 0.0560 | 0.0207 |
| 22 | MTUMn | 2.7124 | 0.0102 | 0.0413 | 0.0067 |
| 23 | XBIn | 1.2731 | 0.0100 | 0.0864 | 0.2030 |
| 24 | SPHQn | 2.9405 | 0.0095 | 0.0357 | 0.0033 |
| 25 | ESGUn | 2.4656 | 0.0103 | 0.0461 | 0.0137 |
| 26 | SECTn | 1.8663 | 0.0082 | 0.0483 | 0.0620 |
| 27 | VUGn | 2.8874 | 0.0112 | 0.0425 | 0.0039 |
| 28 | LRGFn | 1.5204 | 0.0061 | 0.0444 | 0.1284 |
# To drop objets used
del [logreturns_mean, logreturns_std, z_pvalue, z_value, df_index_plot, fig3, df_indexfactor, fig4, df_index_plot2, df_index_plot3, frame]
1.3. Evaluating the ETF performance
# To calculate the mean by column
logreturns_mean=logreturns.mean(axis = 0, skipna = True)
# To calculate the std by column
logreturns_std=logreturns.std(axis = 0, skipna = True)
# It is created a dataframe to get all the information in one object
frame = {'ER': logreturns_mean, 'SIGMA': logreturns_std} # Expected returns (EP)
result = pd.DataFrame(frame)
result['SHARPE'] = result['ER'] / result['SIGMA']
result
| ER | SIGMA | SHARPE | |
|---|---|---|---|
| VTV | 0.0078 | 0.0390 | 0.2000 |
| VGT | 0.0145 | 0.0473 | 0.3068 |
| VB | 0.0084 | 0.0495 | 0.1689 |
| USMV | 0.0082 | 0.0324 | 0.2523 |
| PNQI | 0.0122 | 0.0559 | 0.2186 |
| MTUM | 0.0107 | 0.0412 | 0.2593 |
| XBI | 0.0105 | 0.0864 | 0.1211 |
| SPHQ | 0.0100 | 0.0356 | 0.2807 |
| ESGU | 0.0112 | 0.0460 | 0.2430 |
| SECT | 0.0091 | 0.0481 | 0.1885 |
| VUG | 0.0116 | 0.0425 | 0.2735 |
| LRGF | 0.0068 | 0.0443 | 0.1534 |
| ERP | 0.0117 | 0.0399 | 0.2924 |
| SMB | -0.0006 | 0.0266 | -0.0232 |
| HML | -0.0007 | 0.0327 | -0.0222 |
| RF | 0.0005 | 0.0007 | 0.6846 |
| MOM | 0.0021 | 0.0358 | 0.0585 |
| VTVn | 0.0073 | 0.0391 | 0.1879 |
| VGTn | 0.0141 | 0.0473 | 0.2971 |
| VBn | 0.0079 | 0.0496 | 0.1594 |
| USMVn | 0.0077 | 0.0325 | 0.2379 |
| PNQIn | 0.0118 | 0.0560 | 0.2103 |
| MTUMn | 0.0102 | 0.0413 | 0.2466 |
| XBIn | 0.0100 | 0.0864 | 0.1157 |
| SPHQn | 0.0095 | 0.0357 | 0.2673 |
| ESGUn | 0.0103 | 0.0461 | 0.2241 |
| SECTn | 0.0082 | 0.0483 | 0.1697 |
| VUGn | 0.0112 | 0.0425 | 0.2625 |
| LRGFn | 0.0061 | 0.0444 | 0.1382 |
It is noticed that the ETF VGT presents a higher Sharpe Ratio than the factor ERP. This contributes to the exploratory analysis affirming that the VGT present a better performance.
# In the last table it is shown that the Expected RF in equal to 0.000453 (row 15 and column 1)
result['RF'] = result.iloc[15]['ER']
# result['RF'] = 0.0004533333333333336
# To get all the Jessen's Alphas into a data frame it is created an empty dataframe
jalpha_df = pd.DataFrame()
# List of ETF
ETF_list = ['VTVn', 'VGTn', 'VBn', 'USMVn', 'PNQIn', 'MTUMn', 'XBIn', 'SPHQn', 'ESGUn', 'SECTn', 'VUGn', 'LRGFn']
# list of models
models = []
for resp in ETF_list:
formula = resp + " ~ ERP"
models.append(lm.OLS.from_formula(formula, data = logreturns, hasconst=bool).fit())
# To extrac every parameter from models
jalpha_df['VTV']=models[0].params
jalpha_df['VGT']=models[1].params
jalpha_df['VB']=models[2].params
jalpha_df['USMV']=models[3].params
jalpha_df['PNQI']=models[4].params
jalpha_df['MTUM']=models[5].params
jalpha_df['XBI']=models[6].params
jalpha_df['SPHQ']=models[7].params
jalpha_df['ESGU']=models[8].params
jalpha_df['SECT']=models[9].params
jalpha_df['VUG']=models[10].params
jalpha_df['LRGF']=models[11].params
# print(models[11].summary())
# To get all the Jessen's Alphas pvalues into a data frame it is created an empty dataframe
pjalpha_df = pd.DataFrame()
# To extrac every parameter p value from models
pjalpha_df['VTV']=models[0].pvalues
pjalpha_df['VGT']=models[1].pvalues
pjalpha_df['VB']=models[2].pvalues
pjalpha_df['USMV']=models[3].pvalues
pjalpha_df['PNQI']=models[4].pvalues
pjalpha_df['MTUM']=models[5].pvalues
pjalpha_df['XBI']=models[6].pvalues
pjalpha_df['SPHQ']=models[7].pvalues
pjalpha_df['ESGU']=models[8].pvalues
pjalpha_df['SECT']=models[9].pvalues
pjalpha_df['VUG']=models[10].pvalues
pjalpha_df['LRGF']=models[11].pvalues
# To transpose the dataset
pjalpha_df = pjalpha_df.T
pjalpha_df.rename(columns = {'Intercept':'ALPHA_CAPM', 'ERP':'BETA_CAPM'}, inplace = True) # To rename the columns
# To get all the Jessen's Alphas R2 into a data frame it is created an empty dataframe
# To extrac every R2 from models
# Initializing list of lists
data = [['VTV',models[0].rsquared], ['VGT',models[1].rsquared], ['VB',models[2].rsquared], ['USMV',models[3].rsquared], ['PNQI',models[4].rsquared], ['MTUM',models[5].rsquared], ['XBI',models[6].rsquared], ['SPHQ',models[7].rsquared], ['ESGU',models[8].rsquared], ['SECT',models[9].rsquared], ['VUG',models[10].rsquared], ['LRGF',models[11].rsquared]]
# Creating the pandas DataFrame with the R2 for every model
models_r2 = pd.DataFrame(data, columns = ['index', 'CAPM'])
del [data]
# To get all the Jessen's Alphas R2 into a data frame it is created an empty dataframe
# To extrac every R2 from models
# Initializing list of lists
data = [['VTV',models[0].rsquared_adj], ['VGT',models[1].rsquared_adj], ['VB',models[2].rsquared_adj], ['USMV',models[3].rsquared_adj], ['PNQI',models[4].rsquared_adj], ['MTUM',models[5].rsquared_adj], ['XBI',models[6].rsquared_adj], ['SPHQ',models[7].rsquared_adj], ['ESGU',models[8].rsquared_adj], ['SECT',models[9].rsquared_adj], ['VUG',models[10].rsquared_adj], ['LRGF',models[11].rsquared_adj]]
# Creating the pandas DataFrame with the R2 for every model
models_r2a = pd.DataFrame(data, columns = ['index', 'CAPM'])
del [data]
# To clean the memory
del[formula, frame, models, resp, ETF_list, logreturns_std, logreturns_mean]
# To transpose the dataset
jalpha_df = jalpha_df.T
jalpha_df.rename(columns = {'Intercept':'ALPHA_CAPM', 'ERP':'BETA_CAPM'}, inplace = True) # To rename the columns
# Joining the result dataframe and the Jesen's Alpha dataframe to have a sinlge dataframe of results
result.reset_index(inplace=True)
jalpha_df.reset_index(inplace=True)
result = pd.merge(result, jalpha_df, how='left', left_on=['index'], right_on=['index'])
result = result.dropna(subset=['ALPHA_CAPM']) #Droping NA in the CAPM column
del [jalpha_df]
Treynor ratio
# Calculating the Treynor ratio
result['TR'] = (result['ER'] - result['RF']) / result['BETA_CAPM']
Fama and French 3 factors model
# To get all the Jessen's Alphas from the 3 factors FF model into a data frame it is created an empty dataframe
jalpha_df = pd.DataFrame()
# List of ETF
ETF_list = ['VTVn', 'VGTn', 'VBn', 'USMVn', 'PNQIn', 'MTUMn', 'XBIn', 'SPHQn', 'ESGUn', 'SECTn', 'VUGn', 'LRGFn']
# list of models
models = []
for resp in ETF_list:
formula = resp + " ~ ERP + SMB + HML"
models.append(lm.OLS.from_formula(formula, data = logreturns, hasconst=bool).fit())
# To extrac every parameter from models
jalpha_df['VTV']=models[0].params
jalpha_df['VGT']=models[1].params
jalpha_df['VB']=models[2].params
jalpha_df['USMV']=models[3].params
jalpha_df['PNQI']=models[4].params
jalpha_df['MTUM']=models[5].params
jalpha_df['XBI']=models[6].params
jalpha_df['SPHQ']=models[7].params
jalpha_df['ESGU']=models[8].params
jalpha_df['SECT']=models[9].params
jalpha_df['VUG']=models[10].params
jalpha_df['LRGF']=models[11].params
# print(models[11].summary())
# To reshape DataFrame from wide format to long format
jalpha_df.reset_index(inplace=True)
jalpha_df.drop([1,2,3], axis=0, inplace=True)
jalpha_df = pd.melt(jalpha_df, id_vars='index', value_vars=['VTV', 'VGT', 'VB', 'USMV', 'PNQI', 'MTUM', 'XBI', 'SPHQ', 'ESGU', 'SECT', 'VUG', 'LRGF'])
del jalpha_df['index']
jalpha_df.rename(columns = {'value':'3FF', 'variable':'index'}, inplace = True) # To rename the columns
# Joining the result dataframe and the Jesen's Alpha dataframe to have a sinlge dataframe of results
result = pd.merge(result, jalpha_df, how='left', left_on=['index'], right_on=['index'])
# To get all the Jessen's Alphas pvalues into a data frame it is created an empty dataframe
pjalpha_df2 = pd.DataFrame()
# To extrac every parameter p value from models
pjalpha_df2['VTV']=models[0].pvalues
pjalpha_df2['VGT']=models[1].pvalues
pjalpha_df2['VB']=models[2].pvalues
pjalpha_df2['USMV']=models[3].pvalues
pjalpha_df2['PNQI']=models[4].pvalues
pjalpha_df2['MTUM']=models[5].pvalues
pjalpha_df2['XBI']=models[6].pvalues
pjalpha_df2['SPHQ']=models[7].pvalues
pjalpha_df2['ESGU']=models[8].pvalues
pjalpha_df2['SECT']=models[9].pvalues
pjalpha_df2['VUG']=models[10].pvalues
pjalpha_df2['LRGF']=models[11].pvalues
# To transpose the dataset
pjalpha_df2 = pjalpha_df2.T
pjalpha_df2.rename(columns = {'Intercept':'ALPHA_3FF', 'ERP':'BETA_3FF', 'SMB':'SMB_3FF', 'HML':'HML_3FF'}, inplace = True) # To rename the columns
pjalpha_df.reset_index(inplace=True)
pjalpha_df2.reset_index(inplace=True)
# Joining the result dataframe and the Jesen's Alpha dataframe to have a sinlge dataframe of results
pjalpha_df3 = pd.merge(pjalpha_df, pjalpha_df2, how='outer', left_on=['index'], right_on=['index'])
# To get all the Jessen's Alphas R2 into a data frame it is created an empty dataframe
# To extrac every R2 from models
# Initializing list of lists
data = [['VTV',models[0].rsquared], ['VGT',models[1].rsquared], ['VB',models[2].rsquared], ['USMV',models[3].rsquared], ['PNQI',models[4].rsquared], ['MTUM',models[5].rsquared], ['XBI',models[6].rsquared], ['SPHQ',models[7].rsquared], ['ESGU',models[8].rsquared], ['SECT',models[9].rsquared], ['VUG',models[10].rsquared], ['LRGF',models[11].rsquared]]
# Creating the pandas DataFrame with the R2 for every model
models3FF_r2 = pd.DataFrame(data, columns = ['index', '3FF'])
# Joining the result dataframe and the Jesen's Alpha dataframe to have a sinlge dataframe of results
models_r2 = pd.merge(models_r2, models3FF_r2, how='outer', left_on=['index'], right_on=['index'])
# To get all the Jessen's Alphas R2 into a data frame it is created an empty dataframe
# To extrac every R2 from models
# Initializing list of lists
data = [['VTV',models[0].rsquared_adj], ['VGT',models[1].rsquared_adj], ['VB',models[2].rsquared_adj], ['USMV',models[3].rsquared_adj], ['PNQI',models[4].rsquared_adj], ['MTUM',models[5].rsquared_adj], ['XBI',models[6].rsquared_adj], ['SPHQ',models[7].rsquared_adj], ['ESGU',models[8].rsquared_adj], ['SECT',models[9].rsquared_adj], ['VUG',models[10].rsquared_adj], ['LRGF',models[11].rsquared_adj]]
# Creating the pandas DataFrame with the R2 for every model
models3FF_r2a = pd.DataFrame(data, columns = ['index', '3FF'])
# Joining the result dataframe and the Jesen's Alpha dataframe to have a sinlge dataframe of results
models_r2a = pd.merge(models_r2a, models3FF_r2a, how='outer', left_on=['index'], right_on=['index'])
# To clean the memory
del[data, pjalpha_df, pjalpha_df2, models3FF_r2, jalpha_df, models, resp, ETF_list, formula]
Fama and French 3 factors model including momentum
# To get all the Jessen's Alphas from the 3 factors + MOM FF model into a data frame it is created an empty dataframe
jalpha_df = pd.DataFrame()
# List of ETF
ETF_list = ['VTVn', 'VGTn', 'VBn', 'USMVn', 'PNQIn', 'MTUMn', 'XBIn', 'SPHQn', 'ESGUn', 'SECTn', 'VUGn', 'LRGFn']
# list of models
models = []
for resp in ETF_list:
formula = resp + " ~ ERP + SMB + HML + MOM"
models.append(lm.OLS.from_formula(formula, data = logreturns, hasconst=bool).fit())
# To extrac every parameter from models
jalpha_df['VTV']=models[0].params
jalpha_df['VGT']=models[1].params
jalpha_df['VB']=models[2].params
jalpha_df['USMV']=models[3].params
jalpha_df['PNQI']=models[4].params
jalpha_df['MTUM']=models[5].params
jalpha_df['XBI']=models[6].params
jalpha_df['SPHQ']=models[7].params
jalpha_df['ESGU']=models[8].params
jalpha_df['SECT']=models[9].params
jalpha_df['VUG']=models[10].params
jalpha_df['LRGF']=models[11].params
# print(models[11].summary())
# To reshape DataFrame from wide format to long format
jalpha_df.reset_index(inplace=True)
jalpha_df.drop([1,2,3,4], axis=0, inplace=True)
jalpha_df = pd.melt(jalpha_df, id_vars='index', value_vars=['VTV', 'VGT', 'VB', 'USMV', 'PNQI', 'MTUM', 'XBI', 'SPHQ', 'ESGU', 'SECT', 'VUG', 'LRGF'])
del jalpha_df['index']
jalpha_df.rename(columns = {'value':'3MFF', 'variable':'index'}, inplace = True) # To rename the columns
# Joining the result dataframe and the Jesen's Alpha dataframe to have a sinlge dataframe of results
result = pd.merge(result, jalpha_df, how='left', left_on=['index'], right_on=['index'])
# To get all the Jessen's Alphas pvalues into a data frame it is created an empty dataframe
pjalpha_df2 = pd.DataFrame()
# To extrac every parameter p value from models
pjalpha_df2['VTV']=models[0].pvalues
pjalpha_df2['VGT']=models[1].pvalues
pjalpha_df2['VB']=models[2].pvalues
pjalpha_df2['USMV']=models[3].pvalues
pjalpha_df2['PNQI']=models[4].pvalues
pjalpha_df2['MTUM']=models[5].pvalues
pjalpha_df2['XBI']=models[6].pvalues
pjalpha_df2['SPHQ']=models[7].pvalues
pjalpha_df2['ESGU']=models[8].pvalues
pjalpha_df2['SECT']=models[9].pvalues
pjalpha_df2['VUG']=models[10].pvalues
pjalpha_df2['LRGF']=models[11].pvalues
# To transpose the dataset
pjalpha_df2 = pjalpha_df2.T
pjalpha_df2.rename(columns = {'Intercept':'ALPHA_3MFF', 'ERP':'BETA_3MFF', 'SMB':'SMB_3MFF', 'HML':'HML_3MFF', 'MOM':'MOM_3MFF'}, inplace = True) # To rename the columns
pjalpha_df2.reset_index(inplace=True)
# Joining the result dataframe and the Jesen's Alpha dataframe to have a sinlge dataframe of results
pjalpha_df = pd.merge(pjalpha_df3, pjalpha_df2, how='outer', left_on=['index'], right_on=['index'])
del [pjalpha_df3, pjalpha_df2]
# To get all the Jessen's Alphas R2 into a data frame it is created an empty dataframe
# To extrac every R2 from models
# Initializing list of lists
data = [['VTV',models[0].rsquared], ['VGT',models[1].rsquared], ['VB',models[2].rsquared], ['USMV',models[3].rsquared], ['PNQI',models[4].rsquared], ['MTUM',models[5].rsquared], ['XBI',models[6].rsquared], ['SPHQ',models[7].rsquared], ['ESGU',models[8].rsquared], ['SECT',models[9].rsquared], ['VUG',models[10].rsquared], ['LRGF',models[11].rsquared]]
# Creating the pandas DataFrame with the R2 for every model
models3FF_r2 = pd.DataFrame(data, columns = ['index', '3MFF'])
# Joining the result dataframe and the Jesen's Alpha dataframe to have a sinlge dataframe of results
models_r2 = pd.merge(models_r2, models3FF_r2, how='outer', left_on=['index'], right_on=['index'])
# To get all the Jessen's Alphas R2 into a data frame it is created an empty dataframe
# To extrac every R2 from models
# Initializing list of lists
data = [['VTV',models[0].rsquared_adj], ['VGT',models[1].rsquared_adj], ['VB',models[2].rsquared_adj], ['USMV',models[3].rsquared_adj], ['PNQI',models[4].rsquared_adj], ['MTUM',models[5].rsquared_adj], ['XBI',models[6].rsquared_adj], ['SPHQ',models[7].rsquared_adj], ['ESGU',models[8].rsquared_adj], ['SECT',models[9].rsquared_adj], ['VUG',models[10].rsquared_adj], ['LRGF',models[11].rsquared_adj]]
# Creating the pandas DataFrame with the R2 for every model
models3FF_r2a = pd.DataFrame(data, columns = ['index', '3MFF'])
# Joining the result dataframe and the Jesen's Alpha dataframe to have a sinlge dataframe of results
models_r2a = pd.merge(models_r2a, models3FF_r2a, how='outer', left_on=['index'], right_on=['index'])
# To clean the memory
del[jalpha_df, models, resp, ETF_list, formula]
# Performance measures - visualization
fig = px.bar(result, x="index", y="SHARPE", title='Sharpe ratio for ETF')
fig.show()
fig3 = px.bar(result, x="index", y="TR", title='Treynor ratio for ETF', color_discrete_sequence=px.colors.qualitative.Pastel1)
fig3.show()
fig2 = px.bar(result, x="index", y="ALPHA_CAPM", title='Jensen Alpha for ETF', color_discrete_sequence=px.colors.qualitative.Set3)
fig2.show()
fig4 = px.bar(result, x="index", y="3FF", title='Fama and French 3 factors Alpha for ETF', color_discrete_sequence=px.colors.qualitative.Set1)
fig4.show()
fig5 = px.bar(result, x="index", y="3MFF", title='Fama and French 3 factors + momentum Alpha for ETF', color_discrete_sequence=px.colors.qualitative.Dark24)
fig5.show()
# To analize the portfolio performance measures in a table
result
| index | ER | SIGMA | SHARPE | RF | ALPHA_CAPM | BETA_CAPM | TR | 3FF | 3MFF | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | VTV | 0.0078 | 0.0390 | 0.2000 | 0.0005 | -0.0034 | 0.9202 | 0.0080 | -0.0034 | -0.0033 |
| 1 | VGT | 0.0145 | 0.0473 | 0.3068 | 0.0005 | 0.0016 | 1.0636 | 0.0132 | 0.0007 | 0.0007 |
| 2 | VB | 0.0084 | 0.0495 | 0.1689 | 0.0005 | -0.0056 | 1.1586 | 0.0068 | -0.0036 | -0.0040 |
| 3 | USMV | 0.0082 | 0.0324 | 0.2523 | 0.0005 | -0.0005 | 0.7074 | 0.0109 | -0.0014 | -0.0020 |
| 4 | PNQI | 0.0122 | 0.0559 | 0.2186 | 0.0005 | -0.0011 | 1.1062 | 0.0106 | -0.0013 | -0.0013 |
| 5 | MTUM | 0.0107 | 0.0412 | 0.2593 | 0.0005 | -0.0002 | 0.9040 | 0.0113 | -0.0012 | -0.0023 |
| 6 | XBI | 0.0105 | 0.0864 | 0.1211 | 0.0005 | -0.0059 | 1.3590 | 0.0074 | -0.0020 | -0.0031 |
| 7 | SPHQ | 0.0100 | 0.0356 | 0.2807 | 0.0005 | -0.0005 | 0.8598 | 0.0111 | -0.0009 | -0.0011 |
| 8 | ESGU | 0.0112 | 0.0460 | 0.2430 | 0.0005 | -0.0018 | 0.9748 | 0.0110 | -0.0022 | -0.0022 |
| 9 | SECT | 0.0091 | 0.0481 | 0.1885 | 0.0005 | -0.0030 | 0.9257 | 0.0093 | -0.0026 | -0.0029 |
| 10 | VUG | 0.0116 | 0.0425 | 0.2735 | 0.0005 | -0.0007 | 1.0154 | 0.0110 | -0.0015 | -0.0016 |
| 11 | LRGF | 0.0068 | 0.0443 | 0.1534 | 0.0005 | -0.0044 | 0.9709 | 0.0065 | -0.0040 | -0.0043 |
It can be seen, taking the decision according to the Sharpe Ratio and Treynor criteria, the best performing portfolio is VGT, as well as taking the decision based on Jensen's Alpha, Alpha derived from the three-factor model and the three-factor plus momentum model of Fama and French (2014), the ETF VGT presents the best performance compared to the other portfolios. A particularity in this exercise is that the ETF VGT is the only one that wins in all the portfolio performance measures, since when analyzing the other indexes the behavior is varied, for example, taking the investment decision according to the Sharpe Ratio the second best performance was presented by the SPHQ while guided by the Treynor Ratio the second portfolio is the MTUM as well as taking Jensen's Alpha as a performance measure. When analyzing the Alpha of the three-factor model of Fama and French (2014) the SPHQ portfolio records the lowest value destruction being the second best portfolio in this measure, as occurs when we rely on the three-factor plus momentum model of Fama and French (2014). It is noticed that the ETF VGT presents a higher Sharpe Ratio than the factor ERP. This contributes to the exploratory analysis affirming that the VGT present a better performance.
# To analize the pvalues for all the estimators from models
pjalpha_df
| index | ALPHA_CAPM | BETA_CAPM | ALPHA_3FF | BETA_3FF | SMB_3FF | HML_3FF | ALPHA_3MFF | BETA_3MFF | SMB_3MFF | HML_3MFF | MOM_3MFF | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | VTV | 0.0093 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.7643 |
| 1 | VGT | 0.4125 | 0.0000 | 0.6827 | 0.0000 | 0.0157 | 0.0000 | 0.6866 | 0.0000 | 0.0169 | 0.0000 | 0.9888 |
| 2 | VB | 0.0015 | 0.0000 | 0.0002 | 0.0000 | 0.0000 | 0.0119 | 0.0001 | 0.0000 | 0.0000 | 0.0011 | 0.0279 |
| 3 | USMV | 0.7299 | 0.0000 | 0.3098 | 0.0000 | 0.0000 | 0.6390 | 0.1526 | 0.0000 | 0.0000 | 0.4436 | 0.0169 |
| 4 | PNQI | 0.7298 | 0.0000 | 0.6539 | 0.0000 | 0.1128 | 0.0000 | 0.6631 | 0.0000 | 0.1176 | 0.0000 | 0.9809 |
| 5 | MTUM | 0.9261 | 0.0000 | 0.4349 | 0.0000 | 0.0736 | 0.0000 | 0.0567 | 0.0000 | 0.1747 | 0.0019 | 0.0000 |
| 6 | XBI | 0.3642 | 0.0000 | 0.6829 | 0.0000 | 0.0000 | 0.0000 | 0.5325 | 0.0000 | 0.0000 | 0.0000 | 0.1973 |
| 7 | SPHQ | 0.5983 | 0.0000 | 0.3180 | 0.0000 | 0.0006 | 0.8587 | 0.2206 | 0.0000 | 0.0011 | 0.4009 | 0.1635 |
| 8 | ESGU | 0.0097 | 0.0000 | 0.0002 | 0.0000 | 0.0000 | 0.0076 | 0.0001 | 0.0000 | 0.0000 | 0.0064 | 0.4036 |
| 9 | SECT | 0.0706 | 0.0000 | 0.1187 | 0.0000 | 0.4221 | 0.0897 | 0.0589 | 0.0000 | 0.1416 | 0.0043 | 0.0071 |
| 10 | VUG | 0.5833 | 0.0000 | 0.0488 | 0.0000 | 0.0000 | 0.0000 | 0.0402 | 0.0000 | 0.0000 | 0.0000 | 0.4912 |
| 11 | LRGF | 0.0006 | 0.0000 | 0.0006 | 0.0000 | 0.5518 | 0.0002 | 0.0003 | 0.0000 | 0.3519 | 0.0001 | 0.1276 |
As can be seen in the before table, the null hypothesis of no statistical difference from zero is rejected for the betas in all models, the beta does not provide valuable information regardless of the model that explains the excess returns of the ETF. Regarding Jensen's alpha in the CAPM, it is found that this factor provides relevant information to explain the excess returns of the ETFs: VGT, USMV, PNQI, MTUM, XBI, SPHQ, SECT and VUG. Now, analyzing the three-factor model of Fama and French (2014) it is found that Alpha provides valuable information for VGT, USMV, PNQI, MTUM, XBI, SPHQ and SECT. Regarding the SMB factor, it was found significantly different from zero for the ETFs PNQI, MTUM, SECT and LRGF. The HML factor only provided information to explain the yields of USMV, SPHQ and SECT. When analyzing the statistical significance of the factors proposed by the three-factor model of Fama and French (2014) plus the momentum factor, it was found that Alpha provided valuable information for VGT, USMV, PNQI, MTUM, XBI, SPHQ and SECT. Regarding the SMB factor, it was only relevant for PNQI, MTUM, SECT and LRGF. Now for the HML factor this should only be taken into account to explain the excess returns of USMV and SPHQ. Finally, the momentum factor was determinant in explaining the returns of the ETFs VTV, VGT, PNQI, XBI, SPHQ, ESGU, VUG and LRGF.
# R2 measures - visualization
fig = px.bar(models_r2, x="index", y="CAPM", title='CAPM R2 for ETF', color_discrete_sequence=px.colors.qualitative.Antique)
fig.show()
fig2 = px.bar(models_r2, x="index", y="3FF", title='3 Factors model FF R2 for ETF', color_discrete_sequence=px.colors.qualitative.Pastel1)
fig2.show()
fig3 = px.bar(models_r2, x="index", y="3MFF", title='3 Factors model FF with Momentum R2 for ETF', color_discrete_sequence=px.colors.qualitative.Dark24)
fig3.show()
# Data in table for R2 of every model
models_r2
| index | CAPM | 3FF | 3MFF | |
|---|---|---|---|---|
| 0 | VTV | 0.8820 | 0.9605 | 0.9605 |
| 1 | VGT | 0.8050 | 0.8658 | 0.8658 |
| 2 | VB | 0.8674 | 0.9614 | 0.9630 |
| 3 | USMV | 0.7560 | 0.8010 | 0.8107 |
| 4 | PNQI | 0.6219 | 0.7188 | 0.7188 |
| 5 | MTUM | 0.8145 | 0.8799 | 0.9248 |
| 6 | XBI | 0.3938 | 0.6703 | 0.6751 |
| 7 | SPHQ | 0.9246 | 0.9322 | 0.9334 |
| 8 | ESGU | 0.9873 | 0.9925 | 0.9926 |
| 9 | SECT | 0.9433 | 0.9484 | 0.9557 |
| 10 | VUG | 0.9071 | 0.9651 | 0.9652 |
| 11 | LRGF | 0.9412 | 0.9527 | 0.9541 |
# R2 adjusted measures - visualization
fig = px.bar(models_r2a, x="index", y="CAPM", title='CAPM R2 adjusted for ETF', color_discrete_sequence=px.colors.qualitative.Vivid)
fig.show()
fig2 = px.bar(models_r2a, x="index", y="3FF", title='3 Factors model FF R2 adjusted for ETF', color_discrete_sequence=px.colors.qualitative.Pastel2)
fig2.show()
fig3 = px.bar(models_r2a, x="index", y="3MFF", title='3 Factors model FF with Momentum R2 adjusted for ETF', color_discrete_sequence=px.colors.qualitative.Set1)
fig3.show()
# Data in table for R2 of every model
models_r2a
| index | CAPM | 3FF | 3MFF | |
|---|---|---|---|---|
| 0 | VTV | 0.8810 | 0.9595 | 0.9591 |
| 1 | VGT | 0.8034 | 0.8623 | 0.8611 |
| 2 | VB | 0.8663 | 0.9604 | 0.9618 |
| 3 | USMV | 0.7539 | 0.7959 | 0.8041 |
| 4 | PNQI | 0.6187 | 0.7116 | 0.7091 |
| 5 | MTUM | 0.8127 | 0.8764 | 0.9218 |
| 6 | XBI | 0.3886 | 0.6618 | 0.6638 |
| 7 | SPHQ | 0.9239 | 0.9305 | 0.9311 |
| 8 | ESGU | 0.9870 | 0.9921 | 0.9920 |
| 9 | SECT | 0.9422 | 0.9452 | 0.9520 |
| 10 | VUG | 0.9063 | 0.9642 | 0.9640 |
| 11 | LRGF | 0.9405 | 0.9509 | 0.9517 |
Regarding the goodness of fit measures R2 and adjusted R2 of the estimated models, as shown in Figures 3 and 4, for the ESGU ETF the CAPM model presented the best fit according to R2 (0.9873). The same occurs for the 3-factor model of Fama and French (2014) where ESGU presents the highest R2 value (0.9925), even when including the momentum factor (R2 of 0.9926). This same situation is presented when the goodness-of-fit measure is penalized by the number of variables included in the model, for the CAPM model presented the best fit for ESGU (adjusted R2 0.9873), as well as for the 3-factor model of Fama and French (2014) (adjusted R2 of 0.9925 for the ESGU factor). Finally, the Fama and French (2014) plus momentum 3-factor model presented the best fit for the ESGU ETF (adjusted R2 0.9926). The ETF that obtained the lowest degree of fit in all the estimated models was the XBI (adjusted R2 for CAPM 0.3938; for 3FF 0.6703 and 3MFF 0.6751). Regarding the VGT portfolio, which obtains the best absolute performance measures, we have that the adjusted R2 of the CAPM was 0.805, of the 3-factor model of Fama and French (2014) was 0.8658 and of the 3-factor model of Fama and French (2014) plus momentum was 0.8658. Despite the fact that none of the models obtained the highest degree of fit, the models presented high R2 and adjusted R2 measures.
Regarding the description of ETFs, VTV, also known as Vanguard Value Index Fund ETF Shares is a fund managed by The Vanguard Group, Inc. that invests in shares of large-cap U.S. companies in various economic sectors (following the principle of diversification by sector). This ETF attempts to replicate the performance of the CRSP US Large Cap Value Index - full replication. The VGT, also known as Vanguard Information Technology Index Fund ETF Shares is a fund managed by The Vanguard Group, Inc. that invests in stocks of companies in the U.S. Information Technology sector. The fund attempts to replicate the performance of the MSCI US Investable Market Index (IMI)/Information Technology 25/50 - full replication. The VB, also known as the Vanguard Small-Cap ETF is a fund managed by The Vanguard Group, Inc. that invests in stocks of U.S. companies with high value growth and small market capitalization in various sectors. The portfolio tracks the performance of the CRSP US Small Cap Index - full replica. The USMV, also known as iShares MSCI USA Min Vol Factor is a portfolio managed by BlackRock Fund Advisors that invests in stocks of U.S. companies in various sectors and with diversified market capitalization. The ETF aims to track the performance of the MSCI USA Minimum Volatility (USD) Index - representative sample. PNQI, also known as Invesco NASDAQ Internet is a portfolio managed by Invesco Capital Management LLC. that invests in stocks of U.S. companies in the information technology, software and services, Internet software, IT services, Internet services and infrastructure, software and application software sectors. It focuses on investing in growth and value companies with diversified market capitalization. It aims to track the performance of the NASDAQ CTA Internet index - full replica. MTUM, also known as iShares MSCI USA Momentum Factor is a portfolio managed by BlackRock Fund Advisors that invests in U.S. equities of companies in diversified sectors. It focuses on investing in companies following the momentum factor. It seeks to track the performance of the MSCI USA Momentum SR Variant Index - representative sample. The XBI, also known as the SPDR S&P Biotech ETF is a portfolio managed by SSGA Funds Management, Inc. that invests in stocks of U.S. companies in sectors such as healthcare, pharmaceuticals, biotechnology and life sciences. It focuses on investing in growth and value companies with diverse market capitalizations. It seeks to track the performance of the S&P Biotechnology Select Industry Index - representative sample. The SPHQ, also known as the Invesco S&P 500 Quality ETF, is a portfolio managed by SSGA Funds Management, Inc. that invests in stocks of U.S. companies in diversified industries. It focuses on investing in growth and value companies with large market capitalization. It seeks to track the performance of the S&P 500 Quality Index - full replication. The ESGU, also known as the iShares ESG Aware MSCI USA ETF is a portfolio managed by BlackRock Fund Advisors that invests in U.S. equities of companies in diversified sectors. It focuses on investing in companies that are recognized for their ESG practices. It seeks to track the performance of the MSCI USA Index. The SECT, also known as Main Sector Rotation ETF is a portfolio managed by Main Management ETFs that invests in U.S. equities of companies in diversified, constantly rotating sectors. It focuses on investing in undervalued companies poised to respond favorably to financial market catalysts. The VUG, also known as the Vanguard Growth ETF, is a portfolio managed by The Vanguard Group, Inc. that invests in stocks of U.S. companies in diversified sectors. It focuses on investing in growth-type companies with large market capitalization. It is intended to track the performance of the CRSP US Large Cap Growth Index - full replica. The LRGF, also known as the iShares MSCI USA Multifactor ETF is a portfolio managed by BlackRock Fund Advisors that invests in stocks of U.S. companies in diversified sectors. It focuses on investing in companies with high exposure to factors such as investment, quality, momentum and low size while maintaining a similar level of risk to the base index. It is based on the MSCI USA Index.
# Cleaning the memory
del [fig, fig2, fig3, fig4, fig5, models3FF_r2, df, data, models3FF_r2a]
In conclution, following the approaches of the previous analysis, one should invest regardless of the investor's risk aversion in the VGT ETF since despite not presenting the best degree of fit in the CAPM, 3 Factor models of Fama and French (2014) and including momentum, which seek to explain its returns, it presents the best metrics in all the portfolio performance measures analyzed in this paper, namely: Sharpe Ratio, Treynor Ratio, Jensen's Alpha, Jensen's Alpha from the 3-factor model of Fama and French (2014) and Jensen's Alpha from the 3-factor model of Fama and French (2014) plus momentum. In summary, as evidenced by the exploratory data analysis the ETF VGT not only presents the best performance throughout the period analyzed, but also manages to find models that largely explain its behavior with the help of the factors explored in this paper. For this reason, it should be the best investment option for any investor seeking the highest return with the lowest total or idiosyncratic risk waiver.